-
-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/1263 update stansummary to report rank-normalized ESS tail, ESS bulk, max abs deviation(MAD), and Rhat #1290
base: develop
Are you sure you want to change the base?
Conversation
This PR is a WIP and will fail until Stan PR stan-dev/stan#3305 is merged. |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
…an-dev/cmdstan into feature/1263-new-rhat-summary
changes to stansummary are running afoul of this PR: #972 |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
in the interest of avoiding omnibus PRs, restored the venerable |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
…an-dev/cmdstan into feature/1263-new-rhat-summary
I'm in transit today so hard to review, will try to approve tomorrow morning. It does look like there are a few makefile changes around the print utility that still need to be reverted? |
checked in reverted makefiles. |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
src/cmdstan/stansummary_helper.hpp
Outdated
if (reorder_params) { | ||
pnames = order_param_names_row_major(param_names); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am reading this correctly, it is only reordering the names, not the actual numerical values?
Before:
Inference for Stan model: issue_342_model
1 chains: each with iter=(10); warmup=(0); thin=(1); 10 iterations saved.
Warmup took 0.23 seconds
Sampling took 0.0018 seconds
Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat
lp__ -15 1.2 3.7 -2.6e+01 -15 -13 10 5.7e+03 0.92
accept_stat__ 0.69 -nan 0.30 0.20 0.86 1.0 -45 -2.6e+04 0.92
stepsize__ 0.89 nan 0.00 0.89 0.89 0.89 nan nan nan
treedepth__ 0.90 0.10 0.32 0.00 1.0 1.0 10 5.7e+03 nan
x[1] 0.61 -nan 0.22 2.8e-01 0.67 0.98 -48 -2.7e+04 0.89
x[2] 0.48 0.051 0.16 1.8e-01 0.48 0.77 10 5.7e+03 0.99
y[1,1] 0.43 0.12 0.39 2.3e-02 0.17 0.93 10 5.7e+03 1.1
y[1,2] 0.60 0.076 0.24 2.5e-01 0.71 0.88 10 5.7e+03 0.90
y[1,3] 0.47 -nan 0.33 4.5e-02 0.71 0.94 -92 -5.2e+04 0.90
y[2,1] 0.51 -nan 0.27 1.7e-01 0.49 0.93 -354 -2.0e+05 0.91
y[2,2] 0.56 0.087 0.28 1.6e-01 0.65 0.99 10 5.7e+03 0.94
y[2,3] 0.52 0.100 0.32 1.9e-03 0.65 0.93 10 5.7e+03 1.2
Samples were drawn using hmc with nuts.
For each parameter, N_Eff is a crude measure of effective sample size,
and R_hat is the potential scale reduction factor on split chains (at
convergence, R_hat=1).
This PR:
Inference for Stan model: issue_342_model
1 chains: each with iter=10; warmup=1000; thin=1; 10 iterations saved.
Warmup took 0.23 seconds
Sampling took 0.0018 seconds
Mean MCSE StdDev MAD 5% 50% 95% ESS_bulk ESS_tail R_hat
lp__ -15 1.2 3.7e+00 0.78 -21 -15 -13 5.0 5.0 1.3
accept_stat__ 0.69 0.096 3.0e-01 0.25 0.21 0.78 0.99
stepsize__ 0.89 nan 1.2e-16 0.00 0.89 0.89 0.89
treedepth__ 0.90 0.10 3.2e-01 0.00 0.45 1.0 1.0
x[1] 0.61 0.069 2.2e-01 0.11 0.28 0.67 0.87 5.0 5.0 0.91
x[2] 0.48 0.051 1.6e-01 0.16 0.26 0.45 0.69 5.0 5.0 1.0
y[1,1] 0.43 0.12 3.9e-01 0.15 0.059 0.16 0.93 5.0 5.0 1.4
y[1,2] 0.51 0.087 2.7e-01 0.38 0.17 0.49 0.90 5.0 5.0 1.4
y[1,3] 0.60 0.076 2.4e-01 0.18 0.25 0.66 0.85 5.0 5.0 1.1
y[2,1] 0.56 0.087 2.8e-01 0.31 0.23 0.53 0.93 5.0 5.0 0.92
y[2,2] 0.47 0.10 3.3e-01 0.39 0.068 0.48 0.86 5.0 5.0 0.99
y[2,3] 0.52 0.100 3.2e-01 0.21 0.036 0.54 0.93 5.0 5.0 1.3
Samples were drawn using hmc with nuts.
For each parameter, ESS_bulk and ESS_tail measure the effective sample size
for the entire sample (bulk) and for the the .05 and .95 tails (tail),
and R_hat measures the potential scale reduction on split chains.
At convergence R_hat will be very close to 1.00.
The variable names are in the same column, the means etc are jumbled
This also means we apparently have no tests that check that the correct name is actually being paired with the correct mean, etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the names are what is used to generate the stats.
cmdstan/src/cmdstan/stansummary_helper.hpp
Lines 365 to 370 in 04b0016
void get_stats(const stan::mcmc::chainset &chains, const Eigen::VectorXd &probs, | |
const std::vector<std::string> ¶m_names, | |
Eigen::MatrixXd &stats) { | |
stats.setZero(); | |
size_t i = 0; | |
for (std::string name : param_names) { |
under the hood, the chainset object translates names to indices.
https://github.com/stan-dev/stan/blob/9048555070493679287a25086e7b85b72ad9531f/src/stan/mcmc/chainset.hpp#L133-L148
in the above output, where are things jumbled?
I see - will check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this time for sure!
Inference for Stan model: issue_342_model
1 chains: each with iter=10; warmup=1000; thin=1; 10 iterations saved.
Warmup took 0.23 seconds
Sampling took 0.0018 seconds
Mean MCSE StdDev MAD 5% 50% 95% ESS_bulk ESS_tail R_hat
lp__ -15 1.2 3.7 0.78 -21 -15 -13 5.0 5.0 1.3
accept_stat__ 0.69 0.096 0.30 0.25 0.21 0.78 0.99
stepsize__ 0.89 nan 0.00 0.00 0.89 0.89 0.89
treedepth__ 0.90 0.10 0.32 0.00 0.45 1.0 1.0
x[1] 0.61 0.069 0.22 0.11 0.28 0.67 0.87 5.0 5.0 0.91
x[2] 0.48 0.051 0.16 0.16 0.26 0.45 0.69 5.0 5.0 1.0
y[1,1] 0.43 0.12 0.39 0.15 0.059 0.16 0.93 5.0 5.0 1.4
y[1,2] 0.60 0.076 0.24 0.18 0.25 0.66 0.85 5.0 5.0 1.1
y[1,3] 0.47 0.10 0.33 0.39 0.068 0.48 0.86 5.0 5.0 0.99
y[2,1] 0.51 0.087 0.27 0.38 0.17 0.49 0.90 5.0 5.0 1.4
y[2,2] 0.56 0.087 0.28 0.31 0.23 0.53 0.93 5.0 5.0 0.92
y[2,3] 0.52 0.100 0.32 0.21 0.036 0.54 0.93 5.0 5.0 1.3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the reordering needs to happen before the stats are computed; not when the stats are written. fixed code and added unit test for this.
…an-dev/cmdstan into feature/1263-new-rhat-summary
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, something breaks when you have array-of-tuples.
Here's a model:
parameters {
tuple(real, matrix[2,3]) x;
array[3,2] tuple(real, real) y;
}
model {
x.1 ~ normal(0, 1);
to_vector(x.2) ~ normal(0, 4);
for (i in 1:3) {
y[i,1].1 ~ normal(0, 1);
y[i,2].1 ~ normal(0, 2);
y[i,1].2 ~ normal(0, 3);
y[i,2].2 ~ normal(0, 4);
}
}
current stansummary:
Inference for Stan model: foo_model
1 chains: each with iter=(1000); warmup=(0); thin=(1); 1000 iterations saved.
Warmup took 0.018 seconds
Sampling took 0.027 seconds
Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat
lp__ -9.6e+00 0.17 3.2 -16 -9.2e+00 -5.2 368 1.4e+04 1.0
accept_stat__ 0.87 4.0e-03 0.13 0.62 0.90 1.0 1025 37961 1.0
stepsize__ 0.62 nan 0.00 0.62 0.62 0.62 nan nan nan
treedepth__ 3.0 5.4e-03 0.15 3.0 3.0 3.0 771 28545 1.00
n_leapfrog__ 7.0 1.4e-02 0.42 7.0 7.0 7.0 862 31940 1.0
divergent__ 0.00 nan 0.00 0.00 0.00 0.00 nan nan nan
energy__ 19 2.4e-01 4.4 13 19 28 351 12988 1.0
x.1 -2.2e-02 0.022 0.99 -1.6 4.0e-03 1.6 1998 7.4e+04 1.00
x.2[1,1] -5.6e-02 0.090 4.2 -7.0 -2.0e-01 7.0 2225 8.2e+04 1.0
x.2[1,2] 2.9e-02 0.087 4.0 -6.7 -7.4e-02 6.6 2151 8.0e+04 1.00
x.2[1,3] -2.2e-02 0.090 4.0 -6.5 -8.4e-02 6.6 1955 7.2e+04 1.00
x.2[2,1] 2.0e-02 0.098 3.9 -6.6 -7.2e-02 6.4 1611 6.0e+04 1.00
x.2[2,2] -1.5e-01 0.092 4.1 -6.8 -2.2e-01 6.6 2021 7.5e+04 1.00
x.2[2,3] 1.0e-01 0.090 3.8 -6.2 1.6e-01 6.4 1787 6.6e+04 1.0
y[1,1].1 4.0e-02 0.023 0.99 -1.6 6.9e-02 1.6 1823 6.8e+04 1.0
y[2,1].2 3.7e-02 0.056 3.1 -5.0 3.3e-03 5.1 3000 1.1e+05 1.00
y[1,1].2 7.3e-03 0.061 2.9 -4.8 5.5e-02 4.9 2302 8.5e+04 1.00
y[3,1].1 5.8e-03 0.021 0.97 -1.6 3.5e-02 1.6 2190 8.1e+04 1.00
y[2,1].1 -1.8e-02 0.021 1.1 -1.7 1.0e-02 1.7 2507 9.3e+04 1.00
y[3,1].2 2.5e-03 0.069 3.2 -5.4 1.9e-02 5.3 2184 8.1e+04 1.00
y[1,2].1 2.2e-02 0.041 2.0 -3.2 3.5e-02 3.3 2381 8.8e+04 1.00
y[2,2].2 -1.4e-02 0.091 4.2 -6.8 -1.6e-01 6.8 2139 7.9e+04 1.00
y[1,2].2 -3.5e-02 0.089 4.0 -6.3 9.7e-02 6.5 2057 7.6e+04 1.00
y[3,2].1 1.5e-02 0.040 1.9 -3.1 4.2e-02 3.0 2323 8.6e+04 1.00
y[2,2].1 -7.0e-02 0.036 2.0 -3.4 -1.2e-01 3.4 3000 1.1e+05 1.00
y[3,2].2 -9.3e-02 0.083 4.0 -6.7 -1.7e-02 6.2 2280 8.4e+04 1.00
Samples were drawn using hmc with nuts.
For each parameter, N_Eff is a crude measure of effective sample size,
and R_hat is the potential scale reduction factor on split chains (at
convergence, R_hat=1).
This PR:
Error during processing. Unknown parameter name y[1,1]
auto find_dups | ||
= std::find(invalid_params.begin(), invalid_params.end(), request); | ||
if (find_dups == invalid_params.end()) { | ||
invalid_params.emplace_back(request); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making invalid a set
to avoid this?
I'm hoping the above is the last issue, but it's hard to say. I don't know if it is necessary to touch the code as much as this PR does -- it seems to me like, in the abstract, basically only changing the get_header and get_stats functions could be enough? And then it would not require all the additional review on the reordering etc, while maintaining existing behavior |
Here's an example of what I mean: https://github.com/stan-dev/cmdstan/compare/stansummary-new-stats Before updating the tests, this diff is only at +59-76, and it adds the ess_bulk, ess_tail, MAD, and new R-Hat to stansummary. If you do want to stop plugging away on this I'd be happy to push it further |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
hi Brian, you're welcome to take over. |
Also fix CSV precision bug in matrix outputs
Submisison Checklist
./runCmdStanTests.py src/test
Summary:
Update utility
stansummary
to usestan::mcmc::chainset
objects (stan-dev/stan#3313) instead ofstan::mcmc::chains
. The chainset object provides C++ implementations of split rank-normalized bulk and tail ESS and Rhat, as well as max_abs_deviation, MCSE_mean, and MCSE_sd, all of which have been added to thestansummary
report.Also updated unit tests which use
stan::mcmc::chains
to usestan::mcmc::chainset
instead.As part of this PR, removed the
print
utility, which has been advertised as deprecated for a very long time.Intended Effect:
Improved convergence diagnostics for
stansummary
, which is used by CmdStanPy and CmdStanR.How to Verify:
Unit tests
Side Effects:
Output of
bin/stansummary
will have different set of columns, and slightly different messages.Documentation:
Separate PR
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Columbia University
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses: